Search for: All records

Creators/Authors contains: "Kunho Kim, Shaurya Rohatgi"

  1. Author name disambiguation (AND) can be defined as the problem of clustering all author mentions extracted from publication or related records in digital libraries and other sources so that each cluster corresponds to a unique author. Pairwise classification is an essential part of AND and is used to estimate the probability that two author mentions belong to the same author. Previous studies trained classifiers on features manually extracted from each attribute of the data. More recently, others trained a model to learn a vector representation from text without considering any structure information. Both approaches have advantages: the former exploits the structure of the data, while the latter captures textual similarity across attributes. Here, we introduce a hybrid method that takes advantage of both by extracting structure-aware features as well as global features. In addition, we introduce a novel way to train a global model utilizing a large number of negative samples. Results on AMiner and PubMed data show a relative improvement in mean average precision (MAP) of more than 7.45% over previous state-of-the-art methods. (An illustrative pairwise-scoring sketch follows below.)
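The abstract describes pairwise classification as estimating the probability that two author mentions refer to the same author, combining structure-aware (per-attribute) features with global (whole-record) text features. The paper's own models and training procedure, including the negative-sampling scheme for the global model, are not reproduced here; the following is a minimal sketch, assuming scikit-learn, of what such a hybrid pairwise scorer could look like. All record fields, helper names, and toy data are hypothetical.

# Minimal sketch (an assumption-laden illustration, not the paper's implementation):
# a pairwise classifier for author name disambiguation that combines
# structure-aware per-attribute features with one global text-similarity feature.
from dataclasses import dataclass, field
from typing import List

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity


@dataclass
class Mention:
    """One author mention extracted from a publication record (hypothetical schema)."""
    name: str
    coauthors: List[str] = field(default_factory=list)
    venue: str = ""
    title: str = ""

    def text(self) -> str:
        # Flatten all attributes into one string for the global feature.
        return f"{self.name} {self.title} {self.venue} {' '.join(self.coauthors)}"


def jaccard(a: List[str], b: List[str]) -> float:
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def structure_features(m1: Mention, m2: Mention) -> List[float]:
    # Structure-aware features: one similarity per attribute of the record.
    return [
        float(m1.name.lower() == m2.name.lower()),    # exact name match
        jaccard(m1.coauthors, m2.coauthors),          # co-author overlap
        float(m1.venue.lower() == m2.venue.lower()),  # same venue
    ]


def global_feature(m1: Mention, m2: Mention, vectorizer: TfidfVectorizer) -> float:
    # Global feature: textual similarity across the whole record at once,
    # ignoring attribute boundaries entirely.
    x = vectorizer.transform([m1.text(), m2.text()])
    return float(cosine_similarity(x[0], x[1])[0, 0])


def pair_vector(m1: Mention, m2: Mention, vectorizer: TfidfVectorizer) -> List[float]:
    return structure_features(m1, m2) + [global_feature(m1, m2, vectorizer)]


# Toy labelled pairs (fabricated, generic records): 1 = same author, 0 = different authors.
a1 = Mention("J. Smith", ["A. Lee", "B. Chen"], "JCDL", "Entity resolution in digital libraries")
a2 = Mention("J. Smith", ["A. Lee"], "JCDL", "Pairwise models for entity resolution")
b1 = Mention("J. Smith", ["C. Park"], "CVPR", "Image segmentation with transformers")

vectorizer = TfidfVectorizer().fit([m.text() for m in (a1, a2, b1)])

X = np.array([pair_vector(a1, a2, vectorizer), pair_vector(a1, b1, vectorizer)])
y = np.array([1, 0])

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X)[:, 1])  # estimated probability that each pair is the same author

In this sketch the structure-aware part compares each attribute separately (name, co-authors, venue), while the global part compares TF-IDF vectors of the concatenated record text; in the paper's hybrid method, a learned global representation trained with many negative samples would play the role of that TF-IDF similarity.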